Kolmogorov-Arnold Networks

Fabian Ruehle (Northeastern University)

17-May-2024, 07:00-08:00

Abstract: We introduce Kolmogorov-Arnold Networks (KANs) as an alternative to standard feed-forward neural networks. KANs are based on the Kolmogorov-Arnold representation theorem, which means for our purposes that we can represent any function we want to learn by a weighted sum over basis functions, taken to be splines. In contrast to standard MLPs, the function basis of KANs is fixed to be piecewise polynomial rather than a combination of weights and non-linearities, and we only learn the parameters that control the individual splines. While this is more expensive than a standard MLP, KANs have two properties that can offset this cost. First, KANs can typically work with far fewer parameters. Second, they exhibit better neural scaling laws, meaning the error decreases faster as the number of parameters grows, compared to MLPs. Fewer parameters also make KANs much more interpretable, especially when combined with the sparsification and pruning techniques we introduce. This makes KANs interesting as tools for symbolic regression and for scientific discovery. We discuss an example from knot theory, where we recover (trivial and non-trivial) relations among knot invariants.
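For reference, the representation theorem underlying the abstract states that any continuous function f on [0,1]^n can be written as

f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),

where the \phi_{q,p} and \Phi_q are univariate functions; KANs parametrize such univariate functions as splines and stack them into layers. As a rough illustration only (not the speaker's implementation; the helper make_kan_edge and its parameters are hypothetical), one learnable KAN "edge" can be sketched in Python as a weighted sum of B-spline basis functions whose coefficients are the trainable parameters:

import numpy as np
from scipy.interpolate import BSpline

def make_kan_edge(grid_min=-1.0, grid_max=1.0, n_intervals=5, degree=3, seed=0):
    """One learnable 1D activation: a clamped B-spline with trainable coefficients."""
    rng = np.random.default_rng(seed)
    inner = np.linspace(grid_min, grid_max, n_intervals + 1)
    # Repeat the end knots `degree` times so the spline is clamped at the grid boundary.
    knots = np.concatenate([[grid_min] * degree, inner, [grid_max] * degree])
    n_coeffs = len(knots) - degree - 1
    coeffs = rng.normal(scale=0.1, size=n_coeffs)  # the trainable parameters
    def phi(x, c):
        # Evaluate the spline at x with coefficient vector c.
        return BSpline(knots, c, degree, extrapolate=True)(x)
    return coeffs, phi

coeffs, phi = make_kan_edge()
x = np.linspace(-1.0, 1.0, 7)
print(phi(x, coeffs))  # values of one spline activation; a full KAN sums many
                       # such edges per node and trains all coefficients jointly

A full KAN composes layers of such edges; the sparsification and pruning mentioned in the abstract roughly amount to penalizing edge functions during training and removing edges whose splines stay close to zero.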

mathematical physics, algebraic geometry, analysis of PDEs, dynamical systems, general mathematics, number theory, probability

Audience: researchers in the topic

Comments: Register for Zoom link: us06web.zoom.us/meeting/register/tZMkcOqtrDMuGNAQcMlvp3-MJwcWXVU6fzXl


UNIST Mathematical Sciences Seminar Series

Organizer: Rak-Kyeong Seong*
*contact for this listing
